Context-dependent feature analysis with random forests
Authors
Abstract
In many cases, feature selection is more complicated than identifying a single subset of input variables that together explain the output. There may be interactions that depend on contextual information, i.e., variables that turn out to be relevant only in some specific circumstances. In this setting, the contribution of this paper is to extend the random forest variable importances framework in order (i) to identify variables whose relevance is context-dependent and (ii) to characterize as precisely as possible the effect of contextual information on these variables. The usage and the relevance of our framework for highlighting context-dependent variables are illustrated on both artificial and real datasets.
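To make the notion of a context-dependent variable concrete, here is a minimal sketch in Python. It is not the paper's method: instead of random forest importances it uses the empirical mutual information between a variable and the output, computed separately within each value of a context variable. The toy data and the helper names (`mutual_information`, `relevance_by_context`, `make_toy_data`) are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts to avoid extra divisions
        mi += p_xy * math.log2(p_xy * n * n / (px[x] * py[y]))
    return mi


def relevance_by_context(xs, ys, contexts):
    """Return {context value: I(X; Y) restricted to samples with that context}."""
    result = {}
    for c in sorted(set(contexts)):
        idx = [i for i, ci in enumerate(contexts) if ci == c]
        result[c] = mutual_information([xs[i] for i in idx], [ys[i] for i in idx])
    return result


def make_toy_data(n=4000, seed=0):
    """Assumed toy data: y copies x when context == 1, and is pure noise otherwise."""
    rng = random.Random(seed)
    xs, ys, cs = [], [], []
    for _ in range(n):
        x = rng.randint(0, 1)
        c = rng.randint(0, 1)
        y = x if c == 1 else rng.randint(0, 1)
        xs.append(x)
        ys.append(y)
        cs.append(c)
    return xs, ys, cs


xs, ys, cs = make_toy_data()
scores = relevance_by_context(xs, ys, cs)
for context_value, score in scores.items():
    print(f"context={context_value}: relevance of x = {score:.3f} bits")
```

In this toy setting the variable carries close to one bit of information about the output in one context and almost none in the other, which is the kind of pattern a single global importance score would average away.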
Similar resources
Banzhaf Random Forests
Random forests are a type of ensemble method which makes predictions by combining the results of several independent trees. However, the theory of random forests has long been outpaced by their application. In this paper, we propose a novel random forests algorithm based on cooperative game theory. Banzhaf power index is employed to evaluate the power of each feature by traversing possible feat...
Image Categorization Using Scene-Context Scale Based on Random Forests
Scene-context plays an important role in scene analysis and object recognition. Among various sources of scene-context, we focus on scene-context scale, which means the effective scale of local context to classify an image pixel in a scene. This paper presents random forests based image categorization using the scene-context scale. The proposed method uses random forests, which are ensembles of...
CARAF: Complex Aggregates within Random Forests
This paper presents an approach integrating complex aggregate features into a relational random forest learner to address relational data mining tasks. CARAF, for Complex Aggregates within RAndom Forests, has two goals. Firstly, it aims at avoiding exhaustive exploration of the large feature space induced by the use of complex aggregates. Its second purpose is to reduce the overfitting introduc...
Random Forests of Binary Hierarchical Classifiers for Analysis of Hyperspectral Data
Statistical classification of hyperspectral data is challenging because the input space is high in dimension and correlated, but labeled information to characterize the class distributions is typically sparse. The resulting classifiers are often unstable and have poor generalization. A new approach that is based on the concept of random forests of classifiers and implemented within a multiclass...
Context-Dependent Data Envelopment Analysis-Measuring Attractiveness and Progress with Interval Data
Data envelopment analysis (DEA) is a method for recognizing the efficient frontier of decision making units (DMUs). This paper presents a context-dependent DEA which uses interval inputs and outputs. The context-dependent approach with interval inputs and outputs can consider a set of DMUs against a special context. Each context shows an efficient frontier including DMUs in particular l...
Journal title:
- CoRR
Volume abs/1605.03848 Issue
Pages -
Publication date 2016